stream-json
stream-json is a micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can parse JSON files far exceeding available memory, streaming individual primitives using a SAX-inspired API.
The stream-json package is a Node.js library that provides a stream interface for parsing and stringifying JSON data. It allows for processing large JSON files or streams in a memory-efficient and time-efficient manner by working with JSON data in chunks rather than loading the entire file into memory.
Parsing JSON
Stream-json can parse JSON files of any size by breaking them down into smaller chunks and processing them one by one. This is particularly useful for working with large JSON files that cannot be loaded into memory all at once.
{"_readableState":{"objectMode":true},"readable":true,"_events":{},"_eventsCount":1}
Stringifying JSON
The package can also stringify large JSON objects by converting them into a stream of data. This allows for efficient writing of JSON data to a file or over the network.
{"_readableState":{"objectMode":true},"readable":true,"_events":{},"_eventsCount":1}
Filters and Transforms
Stream-json provides filters and transformation tools to selectively process and modify JSON data as it is being streamed. This can be used to extract or transform specific parts of the JSON data without having to manipulate the entire dataset.
{"_readableState":{"objectMode":true},"readable":true,"_events":{},"_eventsCount":1}
JSONStream is a package similar to stream-json that offers streaming JSON.parse and stringify. It is widely used and has a simple API, but stream-json provides a more modular approach with plugins and a richer set of features for filtering and transforming data.
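For a rough feel of the difference (doc.json and the rows path are assumptions for illustration), a JSONStream pipeline collapses parsing and selection into a single step:

const JSONStream = require('JSONStream');
const fs = require('fs');

// JSONStream bundles parsing and path-based selection into one stream.
fs.createReadStream('doc.json')
  .pipe(JSONStream.parse('rows.*'))
  .on('data', row => console.log(row));

The stream-json equivalent composes parser(), a filter such as pick(), and a streamer explicitly; this is more verbose, but each stage can be swapped or extended independently.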
big-json provides similar functionality for parsing and stringifying large JSON files. It uses a streaming approach to handle large files, but stream-json has a more comprehensive set of tools for dealing with streams and allows for more complex processing pipelines.
stream-json
stream-json is a micro-library of node.js stream components with minimal dependencies for creating custom data processors oriented on processing huge JSON files while requiring a minimal memory footprint. It can parse JSON files far exceeding available memory. Even individual primitive data items (keys, strings, and numbers) can be streamed piece-wise. A streaming, SAX-inspired, event-based API is included as well.
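As a small sketch of that event-based API (sample.json is a placeholder file name), the bare parser emits a stream of tokens rather than assembled values:

const {parser} = require('stream-json');
const fs = require('fs');

// Tokens look like {name: 'startObject'}, {name: 'keyValue', value: 'a'},
// {name: 'numberValue', value: '1'}, {name: 'endObject'}, ...
fs.createReadStream('sample.json')
  .pipe(parser())
  .on('data', token => console.log(token.name, token.value));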
Available components:
- Filters such as Pick to select parts of a token stream.
- Support for JSON streaming: multiple top-level values, either concatenated (like in "{}[]") or separated with white spaces (like in "true 1 null"). Such streams can be parsed with Parser({jsonStreaming: true}) + StreamValues and produced with Disassembler + Stringer (see the sketch after this list).
- utils/Utf8Stream to sanitize utf8 text input.

All components are meant to be building blocks to create flexible custom data processing pipelines. They can be extended and/or combined with custom code. They can be used together with stream-chain to simplify data processing.
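A minimal sketch of the JSON-streaming combination mentioned above, assuming a placeholder file values.txt that holds whitespace-separated values such as true 1 null {"a":1}:

const {chain} = require('stream-chain');
const {parser} = require('stream-json');
const {streamValues} = require('stream-json/streamers/StreamValues');
const fs = require('fs');

const pipeline = chain([
  fs.createReadStream('values.txt'),
  parser({jsonStreaming: true}), // accept multiple top-level values
  streamValues()                 // assemble each value back into a JS value
]);

pipeline.on('data', ({value}) => console.log(value));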
This toolkit is distributed under the New BSD license.
const {chain} = require('stream-chain');
const {parser} = require('stream-json');
const {pick} = require('stream-json/filters/Pick');
const {ignore} = require('stream-json/filters/Ignore');
const {streamValues} = require('stream-json/streamers/StreamValues');
const fs = require('fs');
const zlib = require('zlib');
const pipeline = chain([
  fs.createReadStream('sample.json.gz'),
  zlib.createGunzip(),
  parser(),
  pick({filter: 'data'}),
  ignore({filter: /\b_meta\b/i}),
  streamValues(),
  data => {
    const value = data.value;
    // keep data only for the accounting department
    return value && value.department === 'accounting' ? data : null;
  }
]);

let counter = 0;
pipeline.on('data', () => ++counter);
pipeline.on('end', () =>
  console.log(`The accounting department has ${counter} employees.`));
See the full documentation in the project's Wiki.
Companion projects:
- stream-csv-as-json: streams huge CSV files in a format compatible with stream-json: rows as arrays of string values. If a header row is used, it can stream rows as objects with named fields.

Installation:

npm install --save stream-json
# or: yarn add stream-json
The whole library is organized as a set of small components, which can be combined to produce the most effective pipeline. All components are based on node.js streams and events, and implement all required standard APIs. It is easy to add your own components to solve your unique tasks.
The code of all components is compact and simple. Please take a look at their source code to see how things are implemented, so you can produce your own components in no time.
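As an illustrative sketch (the class name and filtering logic are hypothetical), a custom component can be an ordinary object-mode Transform dropped into the same pipeline as stream-json's own components:

const {Transform} = require('stream');

// Hypothetical custom component: passes through only assembled values
// whose `department` field matches the given one.
class DepartmentFilter extends Transform {
  constructor(department) {
    super({objectMode: true});
    this._department = department;
  }
  _transform(data, _encoding, callback) {
    if (data.value && data.value.department === this._department) {
      this.push(data);
    }
    callback();
  }
}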
If you find a bug, see a way to simplify an existing component, or create a new generic component that can be reused in a variety of projects, don't hesitate to open a ticket and/or create a pull request.
Release highlights:
- Bugfix: inconsistent object/array braces. Thx Xiao Li.
- Added utils/Utf8Stream to sanitize utf8 input; all parsers support it automatically. Thx john30 for the suggestion.
- Added jsonl/Parser and jsonl/Stringer.

The rest can be consulted in the project's wiki Release History.
FAQs
stream-json is a micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can parse JSON files far exceeding available memory, streaming individual primitives using a SAX-inspired API.
We found that stream-json demonstrated a healthy version release cadence and project activity: the last version was released less than a year ago, and the project has one open source maintainer.